
Conversation

Contributor

@shwina shwina commented Nov 26, 2025

Closes #3749.

Following the study in #3748, this PR adds support for ak.sort() for the CUDA backend, using cuda.compute (https://nvidia.github.io/cccl/python/compute.html).

⚠️ This PR should not be merged in its current state - I'm opening it as a POC at this time.
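
For reference, a minimal sketch of the intended user-facing behaviour (ak.to_backend and ak.sort are existing awkward API; with this PR, sorting a CUDA-backed array would be expected to dispatch to the new CCCL-based path):

import awkward as ak

array = ak.Array([[3.3, 1.1, 2.2], [], [5.5, 4.4]])
cuda_array = ak.to_backend(array, "cuda")    # move the array to the GPU (CuPy buffers)
sorted_cuda = ak.sort(cuda_array)            # with this PR, this would use the CCCL-based sort
print(ak.to_backend(sorted_cuda, "cpu").to_list())
# [[1.1, 2.2, 3.3], [], [4.4, 5.5]]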

@codecov

codecov bot commented Nov 26, 2025

Codecov Report

❌ Patch coverage is 16.00000% with 42 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.60%. Comparing base (b749e49) to head (40e1839).
⚠️ Report is 474 commits behind head on main.

Files with missing lines                   Patch %    Lines
src/awkward/_connect/cuda/_compute.py      0.00%      24 Missing ⚠️
src/awkward/_backends/cupy.py              22.22%     14 Missing ⚠️
src/awkward/_kernels.py                    50.00%     4 Missing ⚠️

Additional details and impacted files

Files with missing lines                   Coverage Δ
src/awkward/_kernels.py                    69.78% <50.00%> (+0.41%) ⬆️
src/awkward/_backends/cupy.py              47.61% <22.22%> (-19.05%) ⬇️
src/awkward/_connect/cuda/_compute.py      0.00% <0.00%> (ø)

... and 198 files with indirect coverage changes


@shwina shwina changed the title from "Add ak.sort() for CUDA backend" to "feat: Add ak.sort() for CUDA backend" on Nov 26, 2025
@github-actions

The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3750

Collaborator

@ianna ianna left a comment

@shwina - this is great! Thanks for implementing it. Let’s keep it open and discuss.

assert virtual_numpyarray.is_all_materialized


@pytest.mark.xfail(reason="awkward_sort is not implemented")
Collaborator

@shwina - we should probably keep the virtual arrays out of the POC. Our design goal is to support delayed (lazy) evaluation of operations on virtual arrays. This ensures that array transformations are represented symbolically until explicitly materialized.

Collaborator

@pfackeldey pfackeldey Nov 27, 2025

I agree with keeping virtual arrays out of this for now. Adding this later is an easy step, since we can just explicitly "maybe_materialize" when calling CCCL kernels.

Just to clarify: virtual arrays are not for delaying kernels/operations. They only lazify the IO when using ak.from_buffers to create an ak.Array. Once the buffer has been delivered by the IO source, everything is eager (there's no symbolic representation of the program or any compute graph here).

It's not the goal of virtual arrays to create a symbolic representation of the program; that doesn't exist in awkward so far. It is what we've done over in https://github.com/dask-contrib/dask-awkward at the level of "highlevel Arrays". In principle it's interesting to have this, but then it should likely be at the level of buffers and the kernels that act on those.
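
A minimal, self-contained sketch of that "materialize, then run an eager GPU kernel" idea (the helper maybe_materialize and the callable-based buffer are illustrative stand-ins, not awkward's actual internals; cp.sort stands in for the eventual cuda.compute call):

import cupy as cp

def maybe_materialize(buffer):
    # Hypothetical helper: if the buffer is still "virtual" (modelled here as a
    # zero-argument callable that performs the IO), call it; otherwise it is
    # already a device array and is returned unchanged.
    return buffer() if callable(buffer) else buffer

def gpu_sort(buffer):
    # Materialize first, then hand the concrete device array to an eager GPU sort.
    data = maybe_materialize(buffer)
    return cp.sort(data)

virtual_buffer = lambda: cp.asarray([3, 1, 2])   # lazy-IO stand-in
print(gpu_sort(virtual_buffer))                  # [1 2 3]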

Collaborator

@ikrommyd ikrommyd Nov 27, 2025

Looking at the CI errors for virtual arrays, by the way, the tests seem to be hitting cupy/cupy#9089, which has been fixed but the fix won't be released until CuPy 14.
I see

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I think that if it weren't for this problem, they would probably pass too. So they can't be enabled until CuPy 14 is out, at least.
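
For context, a minimal illustration of that error class (CuPy mirrors NumPy in refusing to coerce a multi-element array to a bool; this is not the actual failing code path in the tests):

import cupy as cp

a = cp.asarray([1, 2, 3])
try:
    bool(a > 1)   # multi-element array used in a boolean context
except ValueError as err:
    print(err)    # "The truth value of an array with more than one element is ambiguous. ..."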

else:
    raise AssertionError(f"CuPyKernel not found: {index!r}")

# CuPy kernel not found, try cuda.compute as fallback
Collaborator

Perhaps we should prioritize CCCL kernels for GPU operations, with CuPy serving as a fallback when CCCL is unavailable?
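
A minimal sketch of that dispatch order (the registry names here are illustrative, not the PR's actual code):

def resolve_kernel(index, cccl_kernels, cupy_kernels):
    # Prefer a CCCL-backed implementation when one is registered for this
    # kernel index; otherwise fall back to the CuPy kernel; fail loudly if
    # neither backend provides it.
    kernel = cccl_kernels.get(index)
    if kernel is None:
        kernel = cupy_kernels.get(index)
    if kernel is None:
        raise AssertionError(f"kernel not found: {index!r}")
    return kernel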

Comment on lines 186 to 218
strategy:
  fail-fast: false
  matrix:
    cuda-version:
      - 11
      - 12
      - 13

steps:
  - name: Clean the workspace and mamba
    run: |
      rm -rf * .[!.]* || echo "Nothing to clean"
      rm -rf ~/micromamba* || echo "Nothing to clean"
  - uses: actions/checkout@v6
    with:
      submodules: true

  - name: Get micromamba
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cccl-python
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler
Collaborator

Suggested change
strategy:
  fail-fast: false
  matrix:
    cuda-version: [11, 12, 13]

steps:
  - name: Clean the workspace and mamba
    run: |
      rm -rf * .[!.]* || echo "Nothing to clean"
      rm -rf ~/micromamba* || echo "Nothing to clean"
  - uses: actions/checkout@v6
    with:
      submodules: true

  - name: Get micromamba (CUDA <= 11)
    if: ${{ matrix.cuda-version < 12 }}
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler

  - name: Get micromamba (CUDA >= 12, add cccl-python)
    if: ${{ matrix.cuda-version >= 12 }}
    uses: mamba-org/setup-micromamba@v2
    with:
      environment-name: test-env
      init-shell: bash
      create-args: >-
        -c rapidsai
        -c nvidia
        python=3.13
        cccl-python
        cudf
        cupy
        cuda-version=${{ matrix.cuda-version }}
        cuda-toolkit
        cxx-compiler

Collaborator

Here is a hopefully correct fixed matrix that installs cccl-python only when CUDA ≥ 12, and avoids the solver error when CUDA = 11.

Collaborator

@ariostas - given that we still want to use CUDA 11.

Development

Successfully merging this pull request may close these issues.

Missing ak.sort() for CUDA backend

4 participants